Constructing Lexical Transducers
Abstract
A lexical transducer, first discussed in Karttunen, Kaplan and Zaenen 1992, is a specialised finite-state automaton that maps inflected surface forms to lexical forms, and vice versa. The lexical form consists of a canonical representation of the word and a sequence of tags that show the morphological characteristics of the form in question and its syntactic category. For example, a lexical transducer for French might relate the surface form veut to the lexical form vouloir+IndPr+SG+P3. In order to map between these two forms, the transducer may contain a path like the one shown in Fig. 1.
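As a rough illustration of the mapping described above, the sketch below encodes a single transducer path as a sequence of lexical:surface symbol pairs and walks it in both directions. Fig. 1 is not reproduced here, so the particular alignment of symbols (which lexical symbol pairs with which surface symbol, and where the empty symbol appears) is an assumption made for this example, as are the names `PATH`, `build`, `apply_side`, `analyse` and `generate`.

```python
EPS = ""  # epsilon: the empty symbol

# One hypothetical path as (lexical, surface) symbol pairs.
# The alignment is an assumption; the paper's Fig. 1 may pair symbols differently.
PATH = [
    ("v", "v"), ("o", "e"), ("u", "u"), ("l", "t"),
    ("o", EPS), ("i", EPS), ("r", EPS),
    ("+IndPr", EPS), ("+SG", EPS), ("+P3", EPS),
]


def build(paths):
    """Turn symbol-pair paths into arcs: state -> [(lexical, surface, next_state)]."""
    arcs, finals, fresh = {0: []}, set(), 1
    for path in paths:
        state = 0
        for lex, surf in path:
            arcs[state].append((lex, surf, fresh))
            arcs[fresh] = []
            state, fresh = fresh, fresh + 1
        finals.add(state)
    return arcs, finals


def apply_side(arcs, finals, symbols, side):
    """Match `symbols` against one side of the arcs (0 = lexical, 1 = surface)
    and collect the symbols found on the other side."""
    results = []
    stack = [(0, 0, [])]  # (state, position in input, output so far)
    while stack:
        state, pos, out = stack.pop()
        if state in finals and pos == len(symbols):
            results.append("".join(out))
        for arc in arcs[state]:
            inp, outp, nxt = arc[side], arc[1 - side], arc[2]
            new_out = out + [outp] if outp != EPS else out
            if inp == EPS:
                # Epsilon on the input side consumes nothing.
                stack.append((nxt, pos, new_out))
            elif pos < len(symbols) and symbols[pos] == inp:
                stack.append((nxt, pos + 1, new_out))
    return results


ARCS, FINALS = build([PATH])


def analyse(surface):
    """Surface form -> lexical forms."""
    return apply_side(ARCS, FINALS, list(surface), side=1)


def generate(lexical_symbols):
    """Lexical symbols (letters and multi-character tags) -> surface forms."""
    return apply_side(ARCS, FINALS, lexical_symbols, side=0)


print(analyse("veut"))  # ['vouloir+IndPr+SG+P3']
print(generate(list("vouloir") + ["+IndPr", "+SG", "+P3"]))  # ['veut']
```

The same arcs serve both directions, which is the sense in which the transducer maps surface forms to lexical forms "and vice versa". A real lexical transducer would share states across many paths rather than store one linear path per word.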
Related resources
To appear in the proceedings of Coling-94. CONSTRUCTING LEXICAL TRANSDUCERS
A lexical transducer, first discussed in Karttunen, Kaplan and Zaenen 1992, is a specialised finite-state automaton that maps inflected surface forms to lexical forms, and vice versa. The lexical form consists of a canonical representation of the word and a sequence of tags that show the morphological characteristics of the form in question and its syntactic category. For example, a lexical tra...
Lexical Analysis of Agglutinative Languages Using a Dictionary of Lemmas and Lexical Transducers
This paper presents a simple method for performing a lexical analysis of agglutinative languages like Korean, which have a heavy morphology. Especially, for nouns and adverbs with regular morphological modifications and/or high productivity, we do not need to artificially construct huge dictionaries of all inflected forms of lemmas. To construct a dictionary of lemmas and lexical transducers, f...
Nonconcatenative Finite-State Morphology
In the last few years, so-called finite-state morphology, in general, and two-level morphology in particular, have become widely accepted as paradigms for the computational treatment of morphology. Finite-state morphology appeals to the notion of a finite-state transducer, which is simply a classical finite-state automaton whose transitions are labeled with pairs, rather than with single symbo...
Incremental construction and maintenance of morphological analysers based on augmented letter transducers
We define deterministic augmented letter transducers (DALTs), a class of finite-state transducers which provide an efficient way of implementing morphological analysers which tokenize their input (i.e., divide texts into tokens or words) as they analyse it, and show how these morphological analysers may be maintained (i.e., how surface form–lexical form transductions may be added or removed from t...
A Lexical Interface for Finite-State Syntax
This document describes the lexical interface for finite-state syntax as it is currently implemented and used for the development of the French constraint grammar. The system includes finite-state transducers for multiword expressions, for capitalised, misspelt or unknown words and for accent recovery. It also encodes general multiword expressions such as dates or idioms. The tokeniser includes a n...
Publication date: 1994